Concept
lifelong reinforcement learning
Parents
Children
Compositionality, Modularity, Multi-task Learning, Reinforcement Learning (Computer Engineering), Reinforcement Learning (Educational Psychology)
Publications: 1.4K
Citations: 86.9K
Authors: 3.9K
Institutions: 943
Continual Reinforcement Learning Transfer
2013 - 2019
During this era, reinforcement learning research increasingly focused on learning across sequences of tasks, combining modular and hierarchical architectures, expert gating, and progressive task curricula. These approaches supported knowledge transfer and reduced catastrophic forgetting as agents encountered varied environments and objectives, with emphasis on memory-efficient replay strategies, task-aware memory management, and structured knowledge reuse across tasks. Exploration and representation learning were strengthened through intrinsic motivation, stochastic perturbations, and entropy-regularized objectives, which encouraged robust and diverse behaviors in progressively richer environments, often with cross-domain challenges. Researchers adopted scalable architectures such as modular networks, networks of experts, and differentiable planning to support transfer and capacity growth, while cross-domain transfer methods and successor-feature formalisms helped policies generalize across domains.
• Continual/lifelong RL builds capabilities over sequences of tasks through modular networks, expert gating, and curriculum-style task progression, so that knowledge transfers across tasks and forgetting is mitigated [6], [5], [10], [19].
• Memory usage in RL is optimized via prioritized sampling [1], curated replay databases [20], and hierarchical replay [16], improving sample efficiency and knowledge reuse across tasks [17].
• Exploration enhancements combine intrinsic motivation [2], stochastic weight perturbations [18], and entropy-based objectives [4] to drive diverse behaviors and more reliable learning in RL, while rich environments [11] highlight the environmental challenges involved.
• Hierarchical and modular architectures enable scalable transfer across tasks via temporal abstraction and planning modules [2], progressive networks [6], network-of-experts [5], and differentiable planning [9], with multi-domain dialogue [8] illustrating cross-domain application.
• Cross-domain transfer is formalized with successor features and generalized policy improvement [13], zero-shot transfer from task features [3], and cross-domain lifelong transfer RL [19], with hierarchical replay supporting transfer [16].
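As a rough illustration of the successor-feature idea in the last bullet, the sketch below shows generalized policy improvement (GPI) as it is commonly formulated: each previously learned policy i has successor features psi_i(s, a), a new task is described by a weight vector w, the value estimate is Q_i(s, a) = psi_i(s, a) · w, and the agent acts greedily with respect to the maximum over policies. The function names, the toy numbers, and the assumption that rewards are linear in the features are illustrative choices and are not taken from the cited works.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(u: Vector, v: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def gpi_action(psi_per_policy: List[List[Vector]], w: Vector) -> int:
    """Pick an action by generalized policy improvement (hypothetical helper).

    psi_per_policy[i][a] holds the successor features of stored policy i
    for action a in the current state. Under the assumed linear reward
    model r = phi . w, each policy's value is Q_i(s, a) = psi_i(s, a) . w.
    """
    n_actions = len(psi_per_policy[0])
    best_action, best_value = 0, float("-inf")
    for a in range(n_actions):
        # Best value any stored policy promises for this action.
        value = max(dot(psi[a], w) for psi in psi_per_policy)
        if value > best_value:
            best_action, best_value = a, value
    return best_action


# Toy example: two stored policies, two actions, two reward features.
psi = [
    [[1.0, 0.0], [0.2, 0.1]],  # policy 0: features for action 0, action 1
    [[0.0, 0.3], [0.1, 0.9]],  # policy 1: features for action 0, action 1
]
w_new_task = [0.0, 1.0]  # the new task rewards only the second feature
action = gpi_action(psi, w_new_task)
```

Changing only `w` re-evaluates every stored policy on a new task without further learning, which is the sense in which this formalism supports transfer across tasks.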
Continual Lifelong Reinforcement Learning
2020 - 2023